This document summarises and discusses the procedures used to fit linear mixed models to our phenotypic data.
These models were used to estimate the proportion of genetic variance that is attributed to plasticity (GxE) as well as to produce estimates of heritability.
We use linear mixed models to explore the following questions:
Our definition of “plasticity” is simply the difference between the mean trait values on HN - LN.
We present the analysis for each dataset and trait separately. The first section, on height, is more detailed, whereas the other sections are less detailed as they essentially repeat the same procedure.
# Load packages
library(tidyverse)
library(lme4)
library(broom)
library(patchwork)
# Change ggplot2 default aesthetics
theme_set(theme_classic() +
theme(panel.grid = element_blank(),
text = element_text(size = 8),
plot.title = element_text(hjust = -0.07, size = 10, face = "bold",
margin = margin(t = -5, b = 1)),
plot.subtitle = element_text(size = 10, hjust = 0.5)))
# Source some custom functions
source("./functions/plotLmmDiag.R")
source("./functions/extractVarsLmm.R")
There are two main datasets: accessions and MAGIC lines.
Here, we load both the individual-based data and the data summarised by line (with plasticities calculated):
# Read MAGIC line data
magic_ind <- read_rds("../../data/phenotypes/magic_individual.rds")
magic_plas <- read_rds("../../data/phenotypes/magic_plasticity.rds")
# Read Accession data (note there is also senescence data for this set)
acc_ind <- read_rds("../../data/phenotypes/accessions_individual.rds") %>%
filter(set == "silique")
acc_plas <- read_rds("../../data/phenotypes/accessions_plasticity.rds") %>%
filter(set == "silique")
acc_ind_sen <- read_rds("../../data/phenotypes/accessions_individual.rds") %>%
filter(set == "senescence")
acc_plas_sen <- read_rds("../../data/phenotypes/accessions_plasticity.rds") %>%
filter(set == "senescence")
# For modeling purposes, set the variable nitrate as a factor, with LN as the reference value
magic_ind$nitrate <- factor(magic_ind$nitrate, levels = c("LN", "HN"))
acc_ind$nitrate <- factor(acc_ind$nitrate, levels = c("LN", "HN"))
acc_ind_sen$nitrate <- factor(acc_ind_sen$nitrate, levels = c("LN", "HN"))
We start by looking at height, which in this population of plants has a simple genetic basis (due to the Ler erecta mutation). This trait is also reasonably normally distributed, making us more confident of any results from a linear model (with gaussian assumptions about error distribution).
magic_ind %>%
ggplot(aes(height)) +
geom_histogram() +
facet_grid(~ nitrate)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Because we measure the same genetically identical line (genotype) several times in each treatment, we consider that these lines might have different mean values for the trait being considered (this might not be true - individuals could vary regardless of their genotype, i.e. the heritability would be zero). Here, we start with this basic assumption as our null model.
We therefore specify a linear mixed model, including genotype (i.e. the MAGIC line IDs) as a random term - let’s call it the “genetic model”:
# Null model (variance due to genotype only)
m0_magic_height <- lmer(height ~ (1|id_leyser), data = magic_ind)
Before analysing the result, we check some diagnostic plots to assess the suitability of the model to our data.
# plot diagnostics
plotLmmDiag(m0_magic_height)
From the Q-Q plot, we can see that the residuals fit with a normal distribution quite well, although not perfectly at the extremes of the distribution. From the scale-location, we can note that the variance shows some heteroskedasticity (the variance seems to increase with increased height).
The fitted values for this “genetic model” match, approximately, with the mean trait values of each line calculated directly from the data:
magic_ind %>%
group_by(id_leyser) %>%
summarise(mean_height = mean(height, na.rm = TRUE)) %>%
.$mean_height %>%
plot(coef(m0_magic_height)[[1]][,1], xlab = "Mean Height", ylab = "Coeficients from genetic model")
abline(0, 1, col = "red3")
We can examine a summary of this model:
summary(m0_magic_height)
## Linear mixed model fit by REML ['lmerMod']
## Formula: height ~ (1 | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 32895.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8148 -0.6486 -0.0703 0.5714 4.0470
##
## Random effects:
## Groups Name Variance Std.Dev.
## id_leyser (Intercept) 12.65 3.557
## Residual 23.01 4.797
## Number of obs: 5370, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 18.5064 0.1954 94.72
We can see that there is a substantial amount of variance attributed to differences between genotypes (~12.65/(12.65+23.01) ~ 35%).
Being happy with the assumptions of the basic model, we proceed with a model that incorporates the nitrate treatment as a fixed effect - let’s name it the “plasticity model”.
This model tries to estimate the overall (marginal) effect that nitrate has on height:
m1_magic_height <- lmer(height ~ nitrate + (1|id_leyser), data = magic_ind)
summary(m1_magic_height)
## Linear mixed model fit by REML ['lmerMod']
## Formula: height ~ nitrate + (1 | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 32539.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.7071 -0.6595 -0.0554 0.6071 4.3379
##
## Random effects:
## Groups Name Variance Std.Dev.
## id_leyser (Intercept) 12.74 3.569
## Residual 21.43 4.629
## Number of obs: 5370, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 17.2692 0.2055 84.03
## nitrateHN 2.4422 0.1267 19.28
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.312
From the summary above, the model suggests that, on average, plants on HN are ~2.4 cm taller than those on LN. We can confirm this, by comparing it with the means of each of these groups in the original data:
magic_ind %>%
group_by(nitrate) %>%
summarise(mean_height = mean(height, na.rm = TRUE))
## # A tibble: 2 x 2
## nitrate mean_height
## <fct> <dbl>
## 1 LN 17.3
## 2 HN 19.7
We can also estimate the overall amount of variance explained by the model, computing the R2 following the method suggested by Nakagawa & Schielzeth 2010) and implemented in the MuMIn package. This gives us two R2: marginal and conditional. The first one estimates the variance explained by the fixed effects only (nitrate in our case) while the second the overal amount of variance explained by both fixed and random effects:
MuMIn::r.squaredGLMM(m1_magic_height)
## R2m R2c
## 0.04181918 0.39910039
We can see that only ~4% of the variance is due to nitrate, but overall the model explains ~40% of the phenotypic variance.
We can also test if this “plasticity model” explains a significant amount of variance compared to the previous “genetic model”:
anova(m0_magic_height, m1_magic_height) %>% tidy
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m0_magic_height 3 32900.31 32920.08 -16447.16 32894.31 NA NA
## 2 m1_magic_height 4 32543.68 32570.04 -16267.84 32535.68 358.6264 1
## p.value
## 1 NA
## 2 5.606323e-80
The result suggests that the likelihood difference between the two models is very unlikely assuming they both explain a similar amount of variance in height.
However, it’s worth noting that the interpretation of p-values should be done with care. Our sample sizes are quite big, so even small effects can increase the likelihood of the model enough and thus lead to very small p-value (this will be obvious below on flowering time). Although we report the results for these likelihood ratio tests here, they’re actually rather uninformative for our purpose, which is to explore how the variance in phenotype can be partitioned into genetic and non-genetic components.
The previous model assumes that every genotype responds equally to the nitrate (i.e. that every MAGIC line is, on average, 2.4cm taller on HN). But this is not necessarily true if there is variation for plasticity. To account for this variation, we introduce an interaction term in our model - let’s call it the “GxE” model. We do so by fitting a random slopes model:
m2_magic_height <- lmer(height ~ nitrate + (nitrate|id_leyser), data = magic_ind)
summary(m2_magic_height)
## Linear mixed model fit by REML ['lmerMod']
## Formula: height ~ nitrate + (nitrate | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 32369.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6430 -0.6414 -0.0481 0.5794 4.4174
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id_leyser (Intercept) 10.511 3.242
## nitrateHN 6.573 2.564 0.08
## Residual 19.673 4.435
## Number of obs: 5370, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 17.2633 0.1888 91.44
## nitrateHN 2.4609 0.1801 13.67
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.167
We can see that now there is an appreciable proportion of variance allocated to the difference between LN and HN (~6.6, which is 6.6/(10.5+6.6) ~ 39% of the genetic variance), suggestting that some lines vary more than others between the two treatments.
Here’s the likelihood ratio test comparing this “GxE” model and the “plasticity model”:
anova(m1_magic_height, m2_magic_height) %>% tidy
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m1_magic_height 4 32543.68 32570.04 -16267.84 32535.68 NA NA
## 2 m2_magic_height 6 32377.96 32417.49 -16182.98 32365.96 169.725 2
## p.value
## 1 NA
## 2 1.395369e-37
This can be better visualised by plotting the reaction norms (which are qualitatively similar between the model estimates and from the data):
p1 <- coef(m2_magic_height)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "height", 1:2) %>%
ggplot(aes(nitrate, height, group = id)) +
geom_line(alpha = 0.1) +
labs(title = "Model prediction")
p2 <- magic_ind %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_height = mean(height, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_height, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
Again, we can get an estimate of the variance explained by the fixed effect (nitrate) and all effects together:
MuMIn::r.squaredGLMM(m2_magic_height)
## R2m R2c
## 0.04238779 0.44924148
We can see that adding the interaction increased the explained variance from ~40% to ~45%.
Based on this model, we partition the variance in height into different components:
It should be noted that it is possible that the means in each environment are correlated with the difference between environments (plasticity) and so variance partitioning in this context should be interpreted with care.
We extract variance estimates from the model using a custom function, which uses the methods presented in Brommer (2013):
extractVarsLmm(m2_magic_height)
## var_ln var_hn var_gen var_nitrate var_plas var_res cor_ln_plas
## R2m 10.51149 18.4675 14.4895 1.514055 6.5728 19.67262 0.08320514
## cor_ln_hn r2_fixed r2_all
## R2m 0.8040848 0.04238779 0.4492415
Visually:
extractVarsLmm(m2_magic_height) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
It’s worth pointing that, despite ~16% of genetic variance being atributed to plasticity in this model, the trait is correlated between environments, with an estimated correlation of 0.66 from the data.
The distribution for this trait is highly right-skewed:
magic_ind %>%
ggplot(aes(bolt)) +
geom_histogram(binwidth = 1) +
facet_grid(~ nitrate)
## Warning: Removed 5 rows containing non-finite values (stat_bin).
This gives some problems using the “genetic model” assuming gaussian errors:
m0_magic_bolt <- lmer(bolt ~ (1|id_leyser), data = magic_ind)
plotLmmDiag(m0_magic_bolt)
The residuals are problematic here, with a very high skew towards the tails in the Q-Q plot.
We can also see that the variance increases substantially, as the flowering time increases. This is partially due to the fact that late flowering plants on LN start to show developmental defects or phisiological signs of stress (e.g. accumulation of anthocyanin in the leaves), which probably affects the quality of the data.
We therefore log-transform the data, to reduce the skew:
# log-transform data
magic_ind$boltLog <- log2(magic_ind$bolt)
# plot
ggplot(magic_ind, aes(boltLog)) +
geom_histogram(binwidth = 0.2) +
facet_grid(~ nitrate)
## Warning: Removed 5 rows containing non-finite values (stat_bin).
In the model, this substantially attenuates the heteroskedasticity and departure from normality (although still skewed at the tails):
m0_magic_bolt <- lmer(boltLog ~ (1|id_leyser), data = magic_ind)
plotLmmDiag(m0_magic_bolt)
The “scale-location plot” shows a less dramatic increase in variance along the fitted values, so we will proceed building upon this model.
We could use a generalised linear mixed model (GLMM), although extracting variance components is more challenging in that case. However, we show below how that approach gives qualitatively similar results.
# Fit "plasticity model"
m1_magic_bolt <- lmer(boltLog ~ nitrate + (1|id_leyser), data = magic_ind)
summary(m1_magic_bolt)
## Linear mixed model fit by REML ['lmerMod']
## Formula: boltLog ~ nitrate + (1 | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 3319.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.6689 -0.5773 -0.0875 0.5021 6.6957
##
## Random effects:
## Groups Name Variance Std.Dev.
## id_leyser (Intercept) 0.21900 0.4680
## Residual 0.08418 0.2901
## Number of obs: 5365, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 4.474588 0.024854 180.04
## nitrateHN -0.049160 0.007946 -6.19
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.162
Note that, because of the new model specification, we need to interpret the estimates by converting back to the original units, which we can do as follows:
2^(fixef(m1_magic_bolt))
## (Intercept) nitrateHN
## 22.2323417 0.9664989
According to this model, plants flower on average ~1 day later on HN than on LN. Let’s check this in the original data:
magic_ind %>%
group_by(nitrate) %>%
summarise(mean_bolt = mean(bolt, na.rm = TRUE))
## # A tibble: 2 x 2
## nitrate mean_bolt
## <fct> <dbl>
## 1 LN 24.3
## 2 HN 23.0
The estimate deviates a bit from the original data. This is because these marginal means are likely influenced by the very late-flowering individuals on LN, which skew the mean upwards. (this can be seen in the reaction norm plots, below). Ignoring those late-flowering lines, there is actually very little change between the N treatments.
Finally, the likelihood ratio test:
# Contrast the genetic and plasticity models
anova(m0_magic_bolt, m1_magic_bolt) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m0_magic_bolt 3 3350.643 3370.406 -1672.321 3344.643 NA NA
## 2 m1_magic_bolt 4 3314.503 3340.854 -1653.252 3306.503 38.13944 1
## p.value
## 1 NA
## 2 6.586511e-10
Although the p-value of this test is small, the effect size from the model is very small (an average of 1 day of flowering difference seems biologically insignificant).
We can now fit the random slopes model, to account for variation in plasticity, our “GxE” model:
m2_magic_bolt <- lmer(boltLog ~ nitrate + (nitrate|id_leyser), data = magic_ind)
summary(m2_magic_bolt)
## Linear mixed model fit by REML ['lmerMod']
## Formula: boltLog ~ nitrate + (nitrate | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 3272.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.6618 -0.5846 -0.0753 0.4933 6.7365
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id_leyser (Intercept) 0.245929 0.4959
## nitrateHN 0.004665 0.0683 -0.82
## Residual 0.082939 0.2880
## Number of obs: 5365, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 4.474831 0.026255 170.44
## nitrateHN -0.049734 0.008647 -5.75
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.468
# Contrast plasticity and GxE models
anova(m1_magic_bolt, m2_magic_bolt) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m1_magic_bolt 4 3314.503 3340.854 -1653.252 3306.503 NA NA
## 2 m2_magic_bolt 6 3271.384 3310.910 -1629.692 3259.384 47.11883 2
## p.value
## 1 NA
## 2 5.865103e-11
Again, despite the very low p-value of the test, from the summary above, we can see that the variance attributed to genetic plasticity is actually quite low.
From the reaction norm plots, we can see that most variation comes from the late-flowering lines which seem to flower quite earlier on HN than LN:
p1 <- coef(m2_magic_bolt)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "bolt", 1:2) %>%
ggplot(aes(nitrate, exp(bolt), group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction", y = "bolt")
p2 <- magic_ind %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_bolt = mean(bolt, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_bolt, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
There is one important observation to make: there is a negative correlation between the mean values on LN and the plasticity (this is shown in the model as a correlation of -0.63 between the intercept and slopes). In other words, late flowering plants on LN have a lower slope than those that flower earlier. This is visible in the plot above, as the negative slopes in the reaction norm for late flowering lines on LN (these are plants struggling after more than a month on low nitrate feeding).
The two traits are highly correlated in the two nitrate treatments, which biologically suggests low variation in plasticity for this trait.
The correlation is 0.9418572 from the data.
As before, we can partition the variance:
extractVarsLmm(m2_magic_bolt) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
Confirming very little variation due nitrate and so, biologically, there is basically no plasticity.
As a comparison, this shows how the result from a GLMM assuming Poisson-distributed errors, is qualitatively similar.
In this case we also add an “overdispersion” component, to allow estimation of individual-level (residual) variance (following section IV in Nakagawa & Schielzeth 2010):
magic_ind$id_unique <- 1:nrow(magic_ind)
m2_magic_bolt_glmm <- glmer(bolt ~ nitrate + (nitrate|id_leyser) + (1|id_unique),
data = magic_ind, family = "poisson", na.action = "na.exclude")
extractVarsLmm(m2_magic_bolt_glmm) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
We can see that the result is qualitatively similar to the one obtained using the log-transformed data above.
# Plot distribution
magic_ind %>%
ggplot(aes(totalbr)) +
geom_histogram(binwidth = 1) +
facet_grid(~ nitrate)
Although these are count data, and seem to show non-normality in particular on LN, we start with the simpler gaussian model:
m0_magic_totalbr <- lmer(totalbr ~ (1|id_leyser), data = magic_ind)
plotLmmDiag(m0_magic_totalbr)
This is actually surprising: the residuals fit reasonably well with a gaussian distribution (except towards the tails, as usual) and their variance does not seem to increase with the fitted values.
So we will keep using the LMM using gaussian errors, although we try a GLMM model below as well to show it produces qualitatively similar results.
m1_magic_totalbr <- lmer(totalbr ~ nitrate + (1|id_leyser), data = magic_ind)
summary(m1_magic_totalbr)
## Linear mixed model fit by REML ['lmerMod']
## Formula: totalbr ~ nitrate + (1 | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 20989.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9056 -0.5888 -0.0132 0.5426 7.4926
##
## Random effects:
## Groups Name Variance Std.Dev.
## id_leyser (Intercept) 0.7611 0.8724
## Residual 2.5987 1.6121
## Number of obs: 5370, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.42615 0.05499 44.12
## nitrateHN 2.24755 0.04410 50.96
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.406
The above suggests that plants on HN make, on average, ~2 branches extra than on LN. This matches with the marginal means calculated from the data:
magic_ind %>%
group_by(nitrate) %>%
summarise(mean_totalbr = mean(totalbr, na.rm = TRUE))
## # A tibble: 2 x 2
## nitrate mean_totalbr
## <fct> <dbl>
## 1 LN 2.42
## 2 HN 4.67
We can formally test if this model is better than the simpler “genetic model” [ but by now we know that we don’t like p-values :) ]:
anova(m0_magic_totalbr, m1_magic_totalbr) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic
## 1 m0_magic_totalbr 3 23080.20 23099.96 -11537.10 23074.20 NA
## 2 m1_magic_totalbr 4 20988.97 21015.33 -10490.49 20980.97 2093.225
## Chi.Df p.value
## 1 NA NA
## 2 1 0
And so finally we move on to our “GxE” random slopes model, assuming genetic variation for plasticity:
m2_magic_totalbr <- lmer(totalbr ~ nitrate + (nitrate|id_leyser), data = magic_ind)
summary(m2_magic_totalbr)
## Linear mixed model fit by REML ['lmerMod']
## Formula: totalbr ~ nitrate + (nitrate | id_leyser)
## Data: magic_ind
##
## REML criterion at convergence: 20566.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.0960 -0.5596 -0.0105 0.4933 8.1110
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id_leyser (Intercept) 1.056 1.028
## nitrateHN 1.606 1.267 -0.52
## Residual 2.171 1.473
## Number of obs: 5370, groups: id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.42091 0.06049 40.02
## nitrateHN 2.25601 0.07707 29.27
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.564
From the above we can already see that the variance in slopes is higher than the genotypic variance on LN. We don’t need the test, but here it is:
anova(m1_magic_totalbr, m2_magic_totalbr) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic
## 1 m1_magic_totalbr 4 20988.97 21015.33 -10490.49 20980.97 NA
## 2 m2_magic_totalbr 6 20570.75 20610.28 -10279.38 20558.75 422.2216
## Chi.Df p.value
## 1 NA NA
## 2 2 2.068971e-92
Finally, let’s visualise the reaction norms:
p1 <- coef(m2_magic_totalbr)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "totalbr", 1:2) %>%
ggplot(aes(nitrate, totalbr, group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction", y = "log(totalbr)")
p2 <- magic_ind %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_totalbr = mean(totalbr, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_totalbr, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
The plots are qualitatively similar between the model and the data. We can see the cross-over interaction, such that some lines have positive plasticity, while other have low plasticity.
Again, there is another important thing to mention: the correlation between the intercepts (mean branches on LN) and the slopes (plasticity), which is negative at -0.52.
This means that plants that make more branches on LN are also less plastic than those that make few branches. We explore this further in Fig. 3-4 of the paper. For now, it’s worth to consider that the variance in plasticity is much higher than in the above cases for flowering time and height:
extractVarsLmm(m2_magic_totalbr) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
As noted before, although the diagnostics suggest a LMM with Gaussian assumptions might not be absurd for these data, we confirm the above findings with a general linear mixed model, assuming Poisson errors.
We note that, biologically, shoot branch counts do not readily fit as a Poisson phenomenon, since the activation of each shoot (the “event” in the Poisson) is not an independent event, but rather depends on wether or not other branches in the stem activated, following a strict basipetal sequence of activation.
Despite this, we attempt the GLMM for the full random slopes model:
m2_glmm <- glmer(totalbr ~ nitrate + (nitrate|id_leyser) + (1|id_unique), data = magic_ind,
family = "poisson")
summary(m2_glmm)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: poisson ( log )
## Formula: totalbr ~ nitrate + (nitrate | id_leyser) + (1 | id_unique)
## Data: magic_ind
##
## AIC BIC logLik deviance df.resid
## 20744.7 20784.2 -10366.3 20732.7 5364
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.4431 -0.5093 -0.0086 0.4101 8.5019
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id_unique (Intercept) 0.0000 0.0000
## id_leyser (Intercept) 0.2338 0.4835
## nitrateHN 0.1841 0.4291 -0.88
## Number of obs: 5370, groups: id_unique, 5370; id_leyser, 374
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.77928 0.02880 27.06 <2e-16 ***
## nitrateHN 0.73727 0.02794 26.38 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## nitrateHN -0.859
Analysing the summary of the model reveals that, qualitatively, the above conclusions hold: The variance in plasticity is on the same order as the variance within the LN (although now proportionately lower than before). The effect of HN is still positive, if we transform it to the original scale, it’s:
exp(fixef(m2_glmm))
## (Intercept) nitrateHN
## 2.179905 2.090229
Which is not too different from the Gaussian model.
And, finally, the reaction norms show a similar “criss-cross” pattern, showing cross-over interaction (note that the y-axis is the latent scale of the model, not the original scale):
coef(m2_glmm)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "totalbr", 1:2) %>%
ggplot(aes(nitrate, totalbr, group = id)) + geom_line(alpha = 0.1)
And partitioning the variance:
extractVarsLmm(m2_glmm) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
## Warning in r.squaredGLMM.merMod(model): exp(beta0) of 3.2 is too close to zero, estimate may be unreliable
## Warning in r.squaredGLMM.merMod(model): exp(beta0) of 3.2 is too close to zero, estimate may be unreliable
In conclusion, we can say that there is a significant proportion of variation in shoot branching plasticity, compared with height and flowering time. The “criss-cross” pattern aluded to above has a significant biological consequence, which is discussed further in the paper.
The same procedure as above is used for the accessions. Therefore, the individual steps are explained more sparingly.
# Genetic model
m0_acc_height <- lmer(height ~ (1|id_leyser), data = acc_ind)
plotLmmDiag(m0_acc_height)
Diagnostics look OK (although there are two extreme outliers!).
The “plasticity” model:
m1_acc_height <- lmer(height ~ nitrate + (1|id_leyser), data = acc_ind)
anova(m0_acc_height, m1_acc_height) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m0_acc_height 3 24285.89 24305.09 -12139.95 24279.89 NA NA
## 2 m1_acc_height 4 23595.63 23621.24 -11793.82 23587.63 692.2599 1
## p.value
## 1 NA
## 2 1.441602e-152
The “GxE” model:
m2_acc_height <- lmer(height ~ nitrate + (nitrate|id_leyser), data = acc_ind)
anova(m1_acc_height, m2_acc_height) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m1_acc_height 4 23595.63 23621.24 -11793.82 23587.63 NA NA
## 2 m2_acc_height 6 23514.54 23552.94 -11751.27 23502.54 85.09359 2
## p.value
## 1 NA
## 2 3.327826e-19
Reaction norm plots:
p1 <- coef(m2_acc_height)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "height", 1:2) %>%
ggplot(aes(nitrate, height, group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction")
p2 <- acc_ind %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_height = mean(height, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_height, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
Variance components:
extractVarsLmm(m2_acc_height) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
These data do not have as extreme outliers as in the MAGIC lines:
ggplot(acc_ind, aes(bolt)) +
geom_histogram(binwidth = 2) +
facet_grid(~ nitrate)
## Warning: Removed 80 rows containing non-finite values (stat_bin).
The distribution is still skewed to the right, which we might solve by using log-transformed values. But first I consider the normal LMM:
# Genetic model
m0_acc_bolt <- lmer(bolt ~ (1|id_leyser), data = acc_ind)
plotLmmDiag(m0_acc_bolt)
The variance heteroskedasticity does not seem as extreme as before, although it’s still increasing with fitted values.
Using the transformed values reduces the trend slightly:
acc_ind$boltLog <- log2(acc_ind$bolt)
m0_acc_bolt <- lmer(boltLog ~ (1|id_leyser), data = acc_ind)
plotLmmDiag(m0_acc_bolt)
For consistency with the MAGIC line data, we proceed with the transformed values:
The “plasticity” model:
m1_acc_bolt <- lmer(boltLog ~ nitrate + (1|id_leyser), data = acc_ind)
anova(m0_acc_bolt, m1_acc_bolt) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m0_acc_bolt 3 -4202.591 -4183.441 2104.295 -4208.591 NA NA
## 2 m1_acc_bolt 4 -4208.643 -4183.110 2108.321 -4216.643 8.051855 1
## p.value
## 1 NA
## 2 0.004545708
The “GxE” model:
# Note that changed the optimizer as the model has problems converging with the default one
m2_acc_bolt <- lmer(boltLog ~ nitrate + (nitrate|id_leyser), data = acc_ind,
control = lmerControl(optimizer ="Nelder_Mead"))
anova(m1_acc_bolt, m2_acc_bolt) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m1_acc_bolt 4 -4208.643 -4183.110 2108.321 -4216.643 NA NA
## 2 m2_acc_bolt 6 -4209.044 -4170.745 2110.522 -4221.044 4.401777 2
## p.value
## 1 NA
## 2 0.1107047
The model essentially predicts no plasticity for this trait, which is readily visible in the reaction norm plots:
p1 <- coef(m2_acc_bolt)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "bolt", 1:2) %>%
ggplot(aes(nitrate, bolt, group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction")
p2 <- acc_ind %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_bolt = mean(bolt, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_bolt, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
extractVarsLmm(m2_acc_bolt) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
# Genetic model
m0_acc_totalbr <- lmer(totalbr ~ (1|id_leyser), data = acc_ind)
plotLmmDiag(m0_acc_totalbr)
Diagnostics look OK-ish.
The “plasticity” model:
m1_acc_totalbr <- lmer(totalbr ~ nitrate + (1|id_leyser), data = acc_ind)
anova(m0_acc_totalbr, m1_acc_totalbr) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m0_acc_totalbr 3 18062.27 18081.48 -9028.137 18056.27 NA NA
## 2 m1_acc_totalbr 4 16240.44 16266.05 -8116.222 16232.44 1823.831 1
## p.value
## 1 NA
## 2 0
The “GxE” model:
m2_acc_totalbr <- lmer(totalbr ~ nitrate + (nitrate|id_leyser), data = acc_ind)
anova(m1_acc_totalbr, m2_acc_totalbr) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic Chi.Df
## 1 m1_acc_totalbr 4 16240.44 16266.05 -8116.222 16232.44 NA NA
## 2 m2_acc_totalbr 6 16003.34 16041.75 -7995.671 15991.34 241.1018 2
## p.value
## 1 NA
## 2 4.419837e-53
Reaction norm plots:
p1 <- coef(m2_acc_totalbr)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "totalbr", 1:2) %>%
ggplot(aes(nitrate, totalbr, group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction")
p2 <- acc_ind %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_totalbr = mean(totalbr, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_totalbr, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
extractVarsLmm(m2_acc_totalbr) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
Again, this is the trait with highest GxE.
# Genetic model
m0_accsen_totalbr <- lmer(totalbr ~ (1|id_leyser), data = acc_ind_sen)
plotLmmDiag(m0_accsen_totalbr)
Diagnostics look OK-ish (one extreme outlier).
The “plasticity” model:
m1_accsen_totalbr <- lmer(totalbr ~ nitrate + (1|id_leyser), data = acc_ind_sen)
anova(m0_accsen_totalbr, m1_accsen_totalbr) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic
## 1 m0_accsen_totalbr 3 16431.55 16450.53 -8212.777 16425.55 NA
## 2 m1_accsen_totalbr 4 14000.11 14025.42 -6996.056 13992.11 2433.441
## Chi.Df p.value
## 1 NA NA
## 2 1 0
The “GxE” model:
m2_accsen_totalbr <- lmer(totalbr ~ nitrate + (nitrate|id_leyser), data = acc_ind_sen)
anova(m1_accsen_totalbr, m2_accsen_totalbr) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic
## 1 m1_accsen_totalbr 4 14000.11 14025.42 -6996.056 13992.11 NA
## 2 m2_accsen_totalbr 6 13867.66 13905.62 -6927.832 13855.66 136.4494
## Chi.Df p.value
## 1 NA NA
## 2 2 2.346347e-30
Reaction norm plots:
p1 <- coef(m2_accsen_totalbr)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "totalbr", 1:2) %>%
ggplot(aes(nitrate, totalbr, group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction")
p2 <- acc_ind_sen %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_totalbr = mean(totalbr, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_totalbr, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
extractVarsLmm(m2_accsen_totalbr) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
The variance due to Nitrate increased relatively to 2-silique stage. There is still ~13% GxE variance.
# Square-root transform
acc_ind_sen <- acc_ind_sen %>%
mutate(totalsil = as.numeric(totalsil),
totalsilSqrt = sqrt(totalsil))
ggplot(acc_ind_sen, aes(totalsilSqrt)) +
geom_histogram() +
facet_grid(~ nitrate)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Genetic model
m0_accsen_totalsil <- lmer(totalsil ~ (1|id_leyser), data = acc_ind_sen)
plotLmmDiag(m0_accsen_totalsil)
The diagnostics suggest some heteroskedasticity (variance increases with fitted values).
Square-root transforming the values attenuates this somewhat:
m0_accsen_totalsil <- lmer(totalsilSqrt ~ (1|id_leyser), data = acc_ind_sen)
plotLmmDiag(m0_accsen_totalsil)
So we proceed with the transformed variable.
The “plasticity” model:
m1_accsen_totalsil <- lmer(totalsilSqrt ~ nitrate + (1|id_leyser), data = acc_ind_sen)
anova(m0_accsen_totalsil, m1_accsen_totalsil) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic
## 1 m0_accsen_totalsil 3 19278.77 19297.74 -9636.383 19272.77 NA
## 2 m1_accsen_totalsil 4 16669.86 16695.17 -8330.932 16661.86 2610.901
## Chi.Df p.value
## 1 NA NA
## 2 1 0
The “GxE” model:
m2_accsen_totalsil <- lmer(totalsilSqrt ~ nitrate + (nitrate|id_leyser), data = acc_ind_sen)
anova(m1_accsen_totalsil, m2_accsen_totalsil) %>% tidy()
## refitting model(s) with ML (instead of REML)
## Warning in tidy.anova(.): The following column names in ANOVA output were
## not recognized or transformed: AIC, BIC, logLik, deviance, Chi.Df
## term df AIC BIC logLik deviance statistic
## 1 m1_accsen_totalsil 4 16669.86 16695.17 -8330.932 16661.86 NA
## 2 m2_accsen_totalsil 6 16514.66 16552.62 -8251.331 16502.66 159.2017
## Chi.Df p.value
## 1 NA NA
## 2 2 2.690177e-35
Reaction norm plots:
p1 <- coef(m2_accsen_totalsil)$id_leyser %>%
mutate(id = rownames(.),
nitrateHN = `(Intercept)` + nitrateHN) %>%
gather("nitrate", "totalsil", 1:2) %>%
ggplot(aes(nitrate, totalsil, group = id)) + geom_line(alpha = 0.1) +
labs(title = "Model prediction")
p2 <- acc_ind_sen %>%
group_by(id_leyser, nitrate) %>%
summarise(mean_totalsil = mean(totalsil, na.rm = TRUE)) %>%
ggplot(aes(nitrate, mean_totalsil, group = id_leyser)) +
geom_line(alpha = 0.1) +
labs(title = "Means from data")
p1 + p2
extractVarsLmm(m2_accsen_totalsil) %>%
select(var_gen, var_nitrate, var_plas, var_res) %>%
gather("component", "variance") %>%
mutate(variance = variance/sum(variance)*100) %>%
ggplot(aes(component, variance, label = round(variance, 1))) +
geom_bar(stat = "identity", fill = "grey", colour = "black") +
labs(y = "% variance") +
geom_text(size = 3, position = position_stack(vjust = 0.5))
This is quite similar to that obtained for total branches, which is consistent with the two traits being correlated (shown in Figure S2 of the paper).
Based on the above models, we partition the variance of all the models, to get an idea of the proportion of variance due to genetic and non-genetic components, in particular the genetic component of plasticity (GxE effect).
For convenience only, we re-specify all the final models here:
# Transform variables
magic_ind$boltLog <- log2(magic_ind$bolt)
acc_ind$boltLog <- log2(acc_ind$bolt)
acc_ind_sen$totalsilSqrt <- sqrt(acc_ind_sen$totalsil)
# MAGIC line models
m2_magic_height <- lmer(height ~ nitrate + (nitrate|id_leyser), data = magic_ind)
m2_magic_bolt <- lmer(boltLog ~ nitrate + (nitrate|id_leyser), data = magic_ind)
m2_magic_totalbr <- lmer(totalbr ~ nitrate + (nitrate|id_leyser), data = magic_ind)
# Accession 2-silique models
m2_acc_height <- lmer(height ~ nitrate + (nitrate|id_leyser), data = acc_ind)
m2_acc_bolt <- lmer(boltLog ~ nitrate + (nitrate|id_leyser), data = acc_ind, control = lmerControl(optimizer = "Nelder_Mead"))
m2_acc_totalbr <- lmer(totalbr ~ nitrate + (nitrate|id_leyser), data = acc_ind)
# Accession senescence models
m2_accsen_totalbr <- lmer(totalbr ~ nitrate + (nitrate|id_leyser), data = acc_ind_sen)
m2_accsen_totalsil <- lmer(totalsilSqrt ~ nitrate + (nitrate|id_leyser), data = acc_ind_sen)
Using a custom function, we extract the variance components, and save it into a table:
# Use custom function to partition variance.
# See the function script for details on how these components are extracted
magic_trait_vars <- list(bolt = extractVarsLmm(m2_magic_bolt),
height = extractVarsLmm(m2_magic_height),
branches = extractVarsLmm(m2_magic_totalbr)) %>%
bind_rows(.id = "trait") %>%
gather("component", "variance", var_gen, var_plas, var_res, var_nitrate) %>%
mutate(set = "MAGIC lines")
acc_trait_vars <- list(bolt = extractVarsLmm(m2_acc_bolt),
height = extractVarsLmm(m2_acc_height),
branches = extractVarsLmm(m2_acc_totalbr),
branches_sen = extractVarsLmm(m2_accsen_totalbr),
sil_sen = extractVarsLmm(m2_accsen_totalsil)) %>%
bind_rows(.id = "trait") %>%
gather("component", "variance", var_gen, var_plas, var_res, var_nitrate) %>%
mutate(set = "Accessions")
all_trait_vars <- bind_rows(magic_trait_vars, acc_trait_vars)
# Save this table
if(!dir.exists("./temp_data")) dir.create("./temp_data")
if(!file.exists("./temp_data/all_trait_variances.csv")){
all_trait_vars %>%
select(trait, component, variance, set) %>%
write_csv("./temp_data/all_trait_variances.csv")
}
The table above is used to produce the variance-component barplots:
all_trait_vars %>%
mutate(trait = factor(trait, levels = c("bolt", "height", "branches", "branches_sen", "sil_sen")),
set = factor(set, levels = c("MAGIC lines", "Accessions"))) %>%
group_by(set, trait) %>%
mutate(variance = variance/sum(variance)*100,
component = factor(component, levels = c("var_res", "var_plas", "var_nitrate", "var_gen"))) %>%
ungroup() %>%
ggplot(aes(trait, variance, fill = component, label = round(variance))) +
geom_bar(stat = "identity", colour = "black") +
facet_grid(~ set, scales = "free_x", space = "free_x") +
scale_fill_manual(values = c("white", "gray48", "grey", "black")) +
geom_text(size = 3, position = position_stack(vjust = 0.5)) +
labs(y = "% variance", x = "Trait")
It is clear from this plot that shoot branching is the trait with the greater component of plasticity, of which a good part (~26%) is likely due to genetic differences between the lines.
sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.0
##
## locale:
## [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
## [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
## [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] bindrcpp_0.2 patchwork_0.0.1 broom_0.4.3
## [4] lme4_1.1-15 Matrix_1.2-12 forcats_0.2.0
## [7] stringr_1.2.0 dplyr_0.7.4 purrr_0.2.4
## [10] readr_1.1.1 tidyr_0.7.2 tibble_1.4.2
## [13] ggplot2_2.2.1.9000 tidyverse_1.2.1
##
## loaded via a namespace (and not attached):
## [1] tidyselect_0.2.3 reshape2_1.4.3 splines_3.4.3
## [4] haven_1.1.1 lattice_0.20-35 colorspace_1.3-2
## [7] stats4_3.4.3 htmltools_0.3.6 yaml_2.1.16
## [10] utf8_1.1.3 rlang_0.1.6.9003 nloptr_1.0.4
## [13] pillar_1.1.0 foreign_0.8-69 glue_1.2.0
## [16] withr_2.1.1.9000 modelr_0.1.1 readxl_1.0.0
## [19] bindr_0.1 plyr_1.8.4 munsell_0.4.3
## [22] gtable_0.2.0 cellranger_1.1.0 rvest_0.3.2
## [25] psych_1.7.8 evaluate_0.10.1 labeling_0.3
## [28] knitr_1.18 parallel_3.4.3 Rcpp_0.12.15
## [31] scales_0.5.0.9000 backports_1.1.2 jsonlite_1.5
## [34] mnormt_1.5-5 hms_0.4.1 digest_0.6.14
## [37] stringi_1.1.6 grid_3.4.3 rprojroot_1.3-2
## [40] cli_1.0.0 tools_3.4.3 magrittr_1.5
## [43] lazyeval_0.2.1 crayon_1.3.4 pkgconfig_2.0.1
## [46] MASS_7.3-48 xml2_1.2.0 MuMIn_1.40.0
## [49] lubridate_1.7.1 minqa_1.2.4 assertthat_0.2.0
## [52] rmarkdown_1.8 httr_1.3.1 rstudioapi_0.7
## [55] R6_2.2.2 nlme_3.1-131 compiler_3.4.3